1 Introduction

Observations of pulsars have provided insight into many 
areas of physics and astronomy. Such observations al- 
lowed the discovery of extra-Solar planets (Wolszczan 
& Frail 1992), provided evidence of gravitational wave 
emission (Taylor & Weisberg 1982) and have been used 
to test the general theory of relativity (Kramer et al. 
2006). Pulsars are still being discovered (e.g., Keith 
et al. 2010). These, and previously known pulsars, 
are observed for many research projects with aims as 
diverse as detecting gravitational wave signals (e.g., 
Hobbs et al. 2010), measuring the masses of objects 
in our Solar System (Champion et al. 2010), studying 
the interstellar medium (e.g., Hill et al. 2003, You et
al. 2007) and determining the properties of the pulsars 
themselves (e.g., Lyne et al. 2010). 

Many pulsar observations have been obtained us- 
ing National Facility telescopes which have little re- 
striction on who may apply to carry out observations. 



Time on such telescopes is usually awarded on the basis 
of the scientific merit of an observing proposal. Poli- 
cies exist at most of these telescopes to make the re- 
sulting data available for the general scientific commu- 
nity after a specified period. However, because of the 
amount of data, the complexity of the data formats, 
lack of storage space and because pulsar astronomers 
often develop their own hardware for data acquisition, 
it is difficult for non-team members to obtain such data 
sets after the embargo period. 

Numerous new scientific results have come from
re-processing historical data. For instance, a re-analysis
of a pulsar survey in the Magellanic Clouds led to the
discovery of a single burst of radio emission that may
be extragalactic in origin (Lorimer et al. 2007). The
Parkes multibeam pulsar survey (Manchester et al. 
2001) has been re-processed numerous times which, to 
date, has led to the discovery of a further ~30 pulsars 
(Eatough et al. 2010, Keith et al. 2009) and 10 new 
rotating radio transients (Keane et al. 2010). 
Table 1: Receiver systems used for data in the archive.

Name            Labels           ν (MHz)  Δν (MHz)  N_b  Data span        N_f
--------------  ---------------  -------  --------  ---  ---------------  -----
20cm multibeam  MULTI, MULT_1^a  1369     288^b     13   09/2004-10/2010  87237
H-OH            H-OH             1405     256^b     1    02/2004-05/2007  2181
1050cm^c        1050CM, 10CM     3100     1024      1    02/2004-10/2010  5302
1050cm^c        1050CM, 50CM     732      64        1    02/2004-10/2010  4112
70cm            70CM             430      32        1    05/1991-12/1994  42801

^a For some early files the MULT_1 label is used to represent
   the central beam of the multibeam receiver.
^b Recent digital-filterbank systems provide 256 MHz of
   bandwidth. However, both the multibeam receiver and the
   H-OH receiver can provide wider bandwidths.
^c The 1050CM receiver is a dual-band receiver; see text.



In order to simplify access to astronomical data 
sets the "Virtual Observatory" (VO) was created^1. The
VO aims to provide protocols for the storage, transfer 
and access of astronomical data and is commonly used 
for astronomical catalogues, images and spectral data. 
The standard data formats used by the VO are the 
VOTable^2 and the Flexible Image Transport System
(FITS; Hanisch et al. 2001). Hotan, van Straten & 
Manchester (2004) extended FITS to provide a data 
storage structure that is applicable for pulsar data 
(this format is known as PSRFITS). The PSRFITS 
format allows pulsar observations to be analysed using 
VO tools. However, to date, the pulsar community has 
not extensively used such tools. 

We have developed a data archive that will contain 
most of the recoverable pulsar observations made at 
the Parkes Observatory. The data (both the metadata 
describing the observations and the recorded signal 
from the telescope) have all been recorded in, or con- 
verted to, a common standard and the entire archive 
system has VO capabilities. In this paper we first de- 
scribe the observing systems at the Parkes observatory 
(§2), the data formats used and the observations cur- 
rently available from the data archive (§3), tools avail- 
able for searching and accessing the data (§4), software 
that may be used with the data sets (§5) and a descrip- 
tion of the anticipated longer-term development of the 
data archive (§6). 

2 Observing systems 

All data currently available from the archive were
obtained using the Parkes 64-m radio telescope. The
observing system used for pulsar observations is typically
divided into the "frontend" system, which includes the
receiver, and the "backend" system, which refers to the
hardware used to record and process the signal.

^1 http://www.ivoa.net/
^2 http://www.ivoa.net/Documents/VOTable/

Even though the Parkes telescope allows multiple
receivers to be installed on the telescope simultaneously,
only one frontend can be used for a given observation.
In order to increase the survey speed of the telescope,
various multibeam receivers have been developed. For
instance, the 20 cm multibeam receiver (Staveley-Smith
et al. 1996) allows 13 independent patches of the sky to
be observed simultaneously (referred to as 13 "beams").
The changing lines of sight to radio pulsars lead to
dispersive delays that are time-dependent. To remove
these delays, simultaneous observations at two
widely-spaced frequencies are desirable. A dual-band
receiver has been developed that allows simultaneous
observations in the 10 cm and 50 cm bands (Granet et
al. 2005). A listing of the receiver systems that have
been used for the pulsar observations included in the
archive is given in Table 1. In column order, we provide
the name of the receiver, a label describing the receiver,
its current central frequency, the maximum bandwidth
that the backend instrumentation processed, the number
of available beams, the data span available and the number
of files in the archive that made use of this receiver.
Many of these receivers have been upgraded over time. 
For instance, it was necessary to modify the central ob- 
serving frequency for the 50 cm receiver from 685 MHz 
to 732 MHz because of digital television transmissions. 
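
The value of observing in two widely spaced bands can be illustrated with the standard cold-plasma dispersion relation, in which the delay scales as DM/ν². The sketch below is not taken from the archive software: the function names are hypothetical and the dispersion measure is an arbitrary illustrative value. It uses the 732 MHz and 3100 MHz centre frequencies listed in Table 1.

```python
# Standard dispersion constant used in pulsar astronomy,
# in units of s MHz^2 cm^3 pc^-1.
DISPERSION_CONSTANT = 4.148808e3


def dispersion_delay(dm, freq_mhz):
    """Delay in seconds, relative to infinite frequency, for a
    dispersion measure dm (cm^-3 pc) at frequency freq_mhz (MHz)."""
    return DISPERSION_CONSTANT * dm / freq_mhz ** 2


def delay_between(dm, f_low_mhz, f_high_mhz):
    """Extra delay (s) of the low-frequency signal behind the high one."""
    return dispersion_delay(dm, f_low_mhz) - dispersion_delay(dm, f_high_mhz)


def dm_from_delay(delay_s, f_low_mhz, f_high_mhz):
    """Invert delay_between: recover the DM from a measured delay."""
    return delay_s / (DISPERSION_CONSTANT * (f_low_mhz ** -2 - f_high_mhz ** -2))


if __name__ == "__main__":
    dm = 10.0  # illustrative value only, in cm^-3 pc
    dt = delay_between(dm, 732.0, 3100.0)
    print(f"50 cm band lags 10 cm band by {dt * 1e3:.2f} ms")
    print(f"recovered DM = {dm_from_delay(dt, 732.0, 3100.0):.3f} cm^-3 pc")
```

Because the delay difference between the two bands depends only on the dispersion measure, tracking it over time gives a handle on the time-dependent dispersive delays described above.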

In order to maximise the signal-to-noise ratio of
any pulsar observation it is necessary to observe with
wide bandwidths. When processing such observations
it is essential to remove the effect of interstellar
dispersion. This is often done by dividing the observing
bandwidth into frequency channels. However, each
frequency channel is still affected by the interstellar
dispersion. It is possible to remove the dispersion
entirely by recording the raw signal voltage and
convolving with the inverse of the transfer function of
the interstellar medium. This is known as "coherent
dedispersion" and, as this is computationally intensive,
has only recently been applied to data with large (e.g.,
~256 MHz) bandwidths.
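
The residual dispersion inside a single filterbank channel can be estimated from the same ν⁻² delay law. The following is a rough sketch only: the function name and the dispersion measure of 100 cm^-3 pc are assumptions made for illustration, while the 32 MHz bandwidth, 256 channels and 430 MHz centre frequency are those quoted in this paper for the first-generation filterbank and the 70 cm receiver.

```python
# Standard dispersion constant, in s MHz^2 cm^3 pc^-1.
DISPERSION_CONSTANT = 4.148808e3


def channel_smearing(dm, centre_mhz, bandwidth_mhz, n_chan):
    """Dispersive delay (s) across one frequency channel, i.e. the
    arrival-time difference between the channel's two edges."""
    channel_bw = bandwidth_mhz / n_chan
    f_lo = centre_mhz - channel_bw / 2.0
    f_hi = centre_mhz + channel_bw / 2.0
    return DISPERSION_CONSTANT * dm * (f_lo ** -2 - f_hi ** -2)


if __name__ == "__main__":
    # Hypothetical pulsar with DM = 100 cm^-3 pc observed with
    # 32 MHz split into 256 channels centred on 430 MHz.
    smear = channel_smearing(100.0, 430.0, 32.0, 256)
    print(f"intra-channel smearing: {smear * 1e3:.2f} ms")
```

Smearing of this size cannot be undone after the channels have been detected, which is why coherent dedispersion of the raw voltages is attractive for short-period pulsars.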

When searching for new pulsars ("search-mode"
observations), the signal from the telescope is divided
into multiple frequency channels, digitised and recorded
at a specified sampling rate. For most of the data
sets currently in the archive, only one-bit samples are
recorded and the two polarisation data streams are simply
summed to produce total intensity using an analogue
filterbank system (Manchester et al. 2001). Several
generations of analogue filterbank system have existed
at Parkes. The first generation system is labelled
"AFB_32_256" and provided a bandwidth of 32 MHz
and 256 frequency channels. For later generations, the
backend is simply labelled as the "AFB". If a pulsar
is discovered in a search-mode file then the same data
can subsequently be "folded" at the topocentric period
of the pulsar in order to produce a single pulse profile
for the pulsar.

Table 2: Backend instrumentation at Parkes for which data are included in the archive.

Name                                 Label       Maximum          Mode  Data span        N_f
                                                 bandwidth (MHz)
-----------------------------------  ----------  ---------------  ----  ---------------  -----
Filterbank                           AFB_32_256  32               S     05/1991-12/1994  42801
Analogue filterbank                  AFB         576              S     02/1998-05/2007  75304
Wide band correlator                 WBCORR      256              F     02/2004-03/2007  2858
Parkes digital filterbank 1          PDFB1       256              F     12/2005-01/2008  2388
Parkes digital filterbank 2          PDFB2       256              F     04/2007-05/2010  2249
Parkes digital filterbank 3          PDFB3       1024             FS    03/2008-10/2010  2886
Parkes digital filterbank 4          PDFB4       1024             FS    10/2008-10/2010  2201
Caltech-Parkes-Swinburne-Recorder 2  CPSR2       2x64             F     12/2002-06/2010  13354

The average of many thousands of individual pulses
produces an "average pulse profile" that is usually
stable and is characteristic of the pulsar. As the pulsar's
period may not be known with sufficient precision (or
the pulsar may be in a fast binary system) it is common
to fold only short sections of the data (typically
one-minute sections) as the data are recorded. Subsequent
processing can be undertaken to sum these "integrations"
with a more accurate pulsar ephemeris. The
data archive contains "folded" observations from
numerous observing systems. The
Caltech-Parkes-Swinburne-Recorder (CPSR2; Bailes 2003;
Hotan 2006) coherently de-dispersed the data and usually
produced two data files each with 64 MHz of bandwidth.
CPSR2 was decommissioned in June 2010 and replaced by
the ATNF-Parkes-Swinburne-Recorder (APSR; van Straten
& Bailes 2010) which provides up to 1 GHz of coherently
de-dispersed data. The archive also includes
data from a wide-bandwidth correlator and the suite
of Parkes digital filterbank systems (PDFB1, PDFB2,
PDFB3 and PDFB4) (Manchester et al., in preparation).
Details of these instruments are listed in Table 2,
providing the name of the backend and its label,
the maximum bandwidth that the backend can
process, whether it is used in "Search-mode" (S) or
"Fold-mode" (F), data span and the number of
observations included in the archive. The PDFB systems
record all data as PSRFITS files. Data files from other
instruments have been converted to PSRFITS before
inclusion in the data archive.
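
As an illustration of the folding described earlier in this section, the sketch below averages a uniformly sampled time series into pulse-phase bins at a fixed period. It is a simplified toy, not the algorithm used by any of the backends or by DSPSR: it assumes a constant topocentric period and ignores dedispersion, period derivatives and binary motion. All names are hypothetical.

```python
def fold(samples, tsamp, period, nbin):
    """Fold a time series at a constant period.

    samples : sequence of intensities, one per sample
    tsamp   : sampling interval (s)
    period  : folding period (s)
    nbin    : number of pulse-phase bins in the output profile

    Returns the mean intensity in each phase bin.
    """
    totals = [0.0] * nbin
    counts = [0] * nbin
    for i, value in enumerate(samples):
        t = (i + 0.5) * tsamp            # time at the middle of the sample
        phase = (t / period) % 1.0       # pulse phase in [0, 1)
        b = int(phase * nbin) % nbin
        totals[b] += value
        counts[b] += 1
    return [tot / n if n else 0.0 for tot, n in zip(totals, counts)]


if __name__ == "__main__":
    # Synthetic 5 s time series: a 0.1 s period with a top-hat
    # pulse between phases 0.2 and 0.3, sampled every 1 ms.
    tsamp, period = 0.001, 0.1
    data = [1.0 if 0.2 <= (((i + 0.5) * tsamp / period) % 1.0) < 0.3 else 0.0
            for i in range(5000)]
    print(fold(data, tsamp, period, 10))
```

Folding short sections separately, as done for the one-minute integrations, amounts to calling such a routine on successive chunks and keeping the resulting profiles apart until a better ephemeris is available.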



3 Data sets and data format 

Currently the archive contains data that have been
recovered from five observing projects. A summary of
these data sets is given in Table 3 and details are
provided below. In Table 3 we provide the project name
and reference (identifiers in bold represent continuing
projects), N_f, the number of raw data files currently in
the database, the status of the project ('o' for on-going
projects and 'c' for completed projects), the receiver
and backend instrumentation used, typical individual
file sizes and the date of the first and last observation
stored in the archive^3.

All of the pulsar data stored in the data archive
follow the PSRFITS standard (Hotan, van Straten &
Manchester 2004)^4. Each file contains a single
observation of a pulsar or a particular area of sky; for
observations using the 13-beam multibeam receiver, 13
separate PSRFITS files are produced for each telescope
pointing. We note that the PSRFITS definition allows
the addition of new parameters when required
and therefore older PSRFITS files may not include as
much metadata as later files. Prior to Version 2.10
the format was not fully compliant with Virtual
Observatory standards. We have therefore converted all
such earlier files to the most up-to-date version of
PSRFITS. Even though a large number of parameters are
stored in PSRFITS files, many of these parameters are
not useful as searchable metadata. In Table 4 we list
the parameters that are recorded as part of the data
archive and can be used in order to identify an
observation of interest (for instance, searches can be
carried out on the telescope position, but not on the
attenuator settings for that observation). Note that only
the pulsar J2000 names are stored. We provide no facility
to search on the older B1950 names. The ATNF Pulsar
Catalogue (Manchester et al. 2005)^5 can be used
to determine a pulsar's J2000 name.

Each file was obtained as part of a specific observ- 
ing programme that had been allocated observing time 
on a competitive basis. The relevant metadata describ- 
ing the project was obtained from the original observ- 
ing proposal requesting the use of the telescope. We 
store the proposal abstract and names of researchers 
included on the proposal. This information was obtained
and converted to ensure compliance with the VO protocols.



^3 Note that data have not always been recorded with the
correct project identifier. We recommend that, if possible,
the project identification is confirmed with the observers
before the data are referenced in a publication.

^4 http://www.atnf.csiro.au/research/pulsar/index.php?n=Main.Psrfits

^5 http://www.atnf.csiro.au/research/pulsar/psrcat
Table 3: Data currently stored in the archive.

Project                                 ID    N_f      Status  Receiver             Backends                   Median file size  Data span
--------------------------------------  ----  -------  ------  -------------------  -------------------------  ----------------  ---------------
70cm pulsar survey                      P050  42801    c       70CM                 AFB_32_256                 18 MB             05/1991-12/1994
Young pulsar timing                     P262  4512     c       MULTI, H-OH          AFB                        0.3 MB            02/1998-05/2007
Swinburne Intermediate latitude survey  P309  70792    c       MULTI                AFB                        25 MB             06/1998-03/1999
Parkes Pulsar Timing Array              P456  25610^a  o       MULTI, H-OH, 1050CM  WBCORR, PDFB1, PDFB2,      64^a MB           02/2004-10/2010
                                                                                    PDFB3, PDFB4, CPSR2, APSR
PULSE@Parkes                            P595  329^a    o       MULTI                PDFB2, PDFB3, PDFB4        56^a MB           04/2008-11/2010

^a Not including the calibration files.



Table 4: Searchable metadata stored for each file.

Parameter label          Description
-----------------------  ------------------------------------------------------------
BACKEND                  Backend instrument
BECONFIG                 Backend configuration
DATE (CREATION_DATE)^a   Date that the data file was created
DATE-OBS                 Date of observation (YYYY-MM-DDThh:mm:ss UTC)
DEC (DEC_ANGLE)^a        Declination (dms). For the Virtual Observatory the angle
                         is given in degrees.
FRONTEND                 Name of the receiver
HDRVER                   Version number for the PSRFITS format
MJD                      Start time MJD
NRCVR                    Number of receiver receptors
OBSBW                    Bandwidth for observation (MHz)
OBSERVER                 Initials for observer who carried out the observation
OBSFREQ                  Central observing frequency (MHz)
OBSNCHAN                 Number of frequency channels
OBS_MODE                 Pulsar, calibration or search
PROJID                   Project identification code
RA (RA_ANGLE)^a          Right ascension (hms). For the Virtual Observatory the
                         angle is given in degrees.
SRC_NAME                 Source name or scan identifier
STT_IMJD                 Integer part of the MJD for the observation
STT_LST                  Start Local Sidereal Time (LST)
STT_SMJD                 Start time (sec. past UTC 00h)
STT_OFFS                 Offset in the start time (seconds)
TELESCOP                 Telescope used for observation (currently all set to PARKES)

(FILENAME)^a             Name of the data file
(FILESIZE)^a             Size of the data file
(FILE_LAST_MODIFIED)^a   Date and time for when the file was created or last modified
(OBSTYPE)^a              Type of data file (raw, preprocessed or thumbnail image)
(OBS_LENGTH)^a           Total length of observation (in milliseconds)

^a If a different label is used within the PSRFITS file compared
   to Virtual Observatory searches then the Virtual Observatory
   label is given in parentheses.
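
Table 4 notes that right ascension and declination are held as sexagesimal values in the PSRFITS header but as angles in degrees for Virtual Observatory searches. A minimal sketch of that conversion is given below. It assumes colon-separated strings of the form "hh:mm:ss.s" and "[+/-]dd:mm:ss.s"; the function names are hypothetical and this is not the archive's own conversion code.

```python
def hms_to_degrees(ra):
    """Convert a right ascension string 'hh:mm:ss.s' to degrees."""
    hours, minutes, seconds = (float(part) for part in ra.strip().split(":"))
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)


def dms_to_degrees(dec):
    """Convert a declination string '[+/-]dd:mm:ss.s' to degrees.

    The sign is taken from the string itself so that values such as
    '-00:30:00' are handled correctly.
    """
    text = dec.strip()
    sign = -1.0 if text.startswith("-") else 1.0
    degrees, minutes, seconds = (abs(float(part)) for part in text.split(":"))
    return sign * (degrees + minutes / 60.0 + seconds / 3600.0)


if __name__ == "__main__":
    print(hms_to_degrees("12:00:00"))   # 180.0
    print(dms_to_degrees("-45:30:00"))  # -45.5
```

Reading the sign from the string rather than from the numeric degrees field avoids the common error of losing the sign for declinations between -1° and 0°.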
[Figure 1 plot: post-fit timing residuals for J1539-5626 (rms = 9805.315 μs) versus year.]

Figure 1: Pulsar timing residuals for PSR
J1539-5626 from the young pulsar timing
programme, P262.

3.1 Modification of the data files 

The data-archiving policy is that no further
modifications are made to the raw data files after
conversion to the PSRFITS format. In some cases new header
parameters become available after the conversion to
PSRFITS and such header metadata are updated, but
the raw data are untouched. In rare cases it may
become apparent that a mistake has been made in
converting to PSRFITS from the raw tape or disk files.
In such cases the data files will be replaced with
corrected versions. The database stores information on
when the last modification to any observation file has
been made.

3.2 Fold-mode observations 

3.2.1 Young pulsar timing (Project code: 
P262) 

Long-term pulsar timing projects that have concentrated
on pulsars with relatively small characteristic
ages have been ongoing at Parkes for many years. Such
projects have led to numerous publications on period
glitches, pulsar timing irregularities and updated
pulsar timing ephemerides (e.g., Wang et al. 2000). Here
we describe data from the P262 observing programme
that was carried out between MJDs 50849 and 54224
(from Feb. 1998 to May 2007). The data were recorded
using the analogue filterbank system which records
data in the search-mode format. As these observations
were of known pulsars the majority of the processing
starts by folding the search-mode data at the known
period of the pulsar^6.

^6 In a few cases it may be of interest to fold at a different
period. This could be because other pulsars were observed
within the beam, to check whether the correct pulsar period
is known or because the pulsar has "glitched", implying
that the most recent ephemeris is not suitable for folding
the data. The original search mode data will be made
available, through this archive, at a later date and are
currently available on request.

Data are available for 616 pulsars and were processed as follows:

• The original data files for all recoverable ob- 
servations from the P262 observing programme 
were obtained. 

• The source name was updated to provide the 
most up-to-date name as presented in the ATNF 
Pulsar Catalogue. 

• The data were folded at the known period (us- 
ing the most up-to-date pulsar ephemeris) of the 
pulsar using the DSPSR software (van Straten 
& Bailes 2010) and a fold-mode PSRFITS file 
output. 
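
The MJD range quoted for the P262 programme, and the start-time fields listed in Table 4, can be related to calendar dates with a few lines of code. The sketch below relies only on the definition of the Modified Julian Date (day zero is 1858 November 17, 00:00 UTC); the function names are hypothetical, and combining STT_IMJD, STT_SMJD and STT_OFFS in this way is an assumption based on their descriptions in Table 4.

```python
from datetime import date, timedelta

MJD_EPOCH = date(1858, 11, 17)  # calendar date of MJD 0


def mjd_to_date(mjd):
    """Calendar date (UTC) on which the given MJD begins."""
    return MJD_EPOCH + timedelta(days=int(mjd))


def start_mjd(stt_imjd, stt_smjd, stt_offs=0.0):
    """Fractional start MJD from the integer day, the seconds past
    00h UTC and the fractional-second offset."""
    return stt_imjd + (stt_smjd + stt_offs) / 86400.0


if __name__ == "__main__":
    print(mjd_to_date(50849))         # first P262 observation
    print(mjd_to_date(54224))         # last P262 observation
    print(start_mjd(50849, 43200))    # midday on MJD 50849
```

The two dates fall in February 1998 and May 2007, matching the data span given for the programme.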

In total, 4512 observations were recovered with a median
observation time of five minutes and a total observation
time of 597 hours. The observation filenames
have a leading "f" to indicate that they came from the
analogue filterbank system followed by the date of the
observation. An example filename is "f981007_044636.rf"
for an observation with a UTC start time of 1998 Oct
7, 04h46m36s. As these data were obtained using the
analogue filterbank system we only provide total
intensity profiles.

After the discovery of a pulsar, it is common to 
carry out a small number of "gridding" observations in 
order to improve the pulsar's position to a fraction of 
the telescope beamwidth (Morris et al. 2002). For such 
observations the pulsar signal is often not observable, 
but such files can easily be identified as the telescope 
was not pointing directly at the pulsar. 

An example of the P262 data is shown in Figure 1.
This figure contains the timing residuals (for details
on the pulsar timing method see, e.g., Hobbs et al.
2006) obtained for a typical pulsar, PSR J1539-5626.
For this pulsar 32 observations were obtained as part
of the P262 project over a period of 8.6 yr. The arrival
time uncertainties are smaller than the symbol
size in the figure and have a mean of 33 μs. The timing
model used to determine the pre-fit timing residuals
was obtained from the pulsar ephemeris stored in the
PSRFITS file. The data were first processed using the
PSRCHIVE (Hotan, van Straten & Manchester 2004)
software suite. First, the program PAZ was used to
remove band edges and radio frequency interference
(RFI) and PAM was used to increase the signal-to-noise
ratio by integrating over the frequency channels and
integrations. Pulse times-of-arrival were obtained using
PAT and finally timing residuals were determined using
TEMPO2 (Hobbs, Edwards & Manchester 2006). The
timing residuals are typical of normal pulsars that
exhibit timing noise (cf. Hobbs et al. 2010).

3.2.2 The PULSE@Parkes project (P595)

The PULSE@Parkes project (Hobbs et al. 2009, Hollow
et al. 2008) has been designed to introduce high
school students to astronomy. The students observe
from a selection of ~40 pulsars that are chosen to be
of interest for various scientific projects. The 20 cm
multibeam receiver is used, giving an observing
frequency close to 1400 MHz and a bandwidth of 256 MHz.
Data have been recorded using the PDFB3 and PDFB4
backend systems. Since the start of 2011, the PDFB3
system has been used to produce a high signal-to-noise
pulse profile and simultaneously the PDFB4 system
has recorded in search mode to provide information on
single pulses and the RFI environment. Observations
are typically 2 to 15 min depending on the pulsar's flux
density. A pulsed calibration signal is observed prior
to each observation, allowing each data set to be fully
calibrated in polarisation and flux density.

Table 5: Pulsars observed as part of the
PULSE@Parkes (P595) observing project.

PSR J         Period (s)   DM (cm^-3 pc)  N_obs
------------  -----------  -------------  -----------
[illegible]   [illegible]  [illegible]    [illegible]
[illegible]   [illegible]  [illegible]    [illegible]
[illegible]   [illegible]  [illegible]    [illegible]
J0134-2937    0.137        21.81          10
J0152-1637    0.833        11.92          10
[illegible]   [illegible]  [illegible]    [illegible]
J0437-4715    0.006        2.65           40
[illegible]   [illegible]  [illegible]    [illegible]
J0729-1836    0.510        61.29          18
J0742-2822    0.167        73.78          34
[illegible]   [illegible]  [illegible]    [illegible]
[illegible]   [illegible]  [illegible]    [illegible]
[illegible]   [illegible]  [illegible]    [illegible]
J1003-4747    0.307        98.1           52
J1107-5907    0.253        40.2           54
[illegible]   [illegible]  [illegible]    [illegible]
[illegible]   [illegible]  [illegible]    [illegible]
[illegible]   [illegible]  [illegible]    [illegible]
J1300+1240    0.006        10.17          6
J1349-6130    0.259        284.6          26
J1359-6038    0.128        293.71         22
J1412-6145    0.315        514.7          24
J1453-6413    0.179        71.07          16
J1530-5327    0.279        49.6           10
J1543-0620    0.709        18.40          4
J1634-5107    0.507        372.8          12
J1637-4553    0.119        129.23         8
J1713+0747    0.005        15.99          2
J1717-4054    0.888        307.09         17
J1721-3532    0.280        496.0          12
J1726-3530    1.110        727.00         12
J1807-2715    0.828        312.98         8
J1818-1422    0.291        622.0          2
J1829-1751    0.598        84.44          2
J1820-0427    0.307        217.10         4
J1830-1059    0.405        161.50         2
J1832+0029    0.534        28.3           6
J1902+0615    0.674        502.90         4
J2053-7200    0.341        17.3           6
J2145-0750    0.016        9.00           10
J2317+1439    0.003        21.91          12

PULSE@Parkes is an ongoing project and more
data become available each month. As this project
primarily has an outreach goal, these data sets are
immediately available for download. At the time of
writing we have 661 observations from a total of 41 pulsars
(listed in Table 5 which gives each pulsar's name,
period, dispersion measure and the number of
observations currently in the archive). As for the P262 data,
file names indicate the date and time of the observation.
File names starting with an "r" correspond to
PDFB2 data, "s" to PDFB3 data and "t" to PDFB4
data. Folded pulsar archives have the file extension
".rf". Calibration source files have the extension ".cf"
and observations obtained in search mode have ".sf".
In total 29 GB of data are currently available for
download. We note that some of these pulsars are known
to undergo extreme nulling events (during which the
pulse disappears for many hours or days). Some
observations therefore seem to show no pulse. Many of
the other pulsars are affected by scintillation and,
because of this, may have low signal-to-noise ratios in
some observations.
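
The file-naming convention can be decoded mechanically. The sketch below is illustrative only: the function and dictionary names are hypothetical, the prefix letters and extensions are those given in the text, and the "YYMMDD_hhmmss" layout is inferred from the single example filename "f981007_044636.rf" quoted for the P262 data, so other layouts may exist in the archive.

```python
from datetime import datetime

# Prefix letters and file extensions as described in the text.
BACKEND_BY_PREFIX = {"f": "AFB", "r": "PDFB2", "s": "PDFB3", "t": "PDFB4"}
KIND_BY_EXTENSION = {"rf": "fold", "cf": "calibration", "sf": "search"}


def parse_filename(name):
    """Split an archive filename into backend, file kind and UTC start time.

    Assumes the form <prefix><YYMMDD>_<hhmmss>.<extension>.
    """
    stem, extension = name.rsplit(".", 1)
    prefix, stamp = stem[0], stem[1:]
    return {
        "backend": BACKEND_BY_PREFIX.get(prefix, "unknown"),
        "kind": KIND_BY_EXTENSION.get(extension, "unknown"),
        "start_utc": datetime.strptime(stamp, "%y%m%d_%H%M%S"),
    }


if __name__ == "__main__":
    info = parse_filename("f981007_044636.rf")
    print(info["backend"], info["kind"], info["start_utc"].isoformat())
```

Python's two-digit year rule maps 69 to 99 onto 1969 to 1999 and 00 to 68 onto 2000 to 2068, which covers the 1991 to 2010 data span of the archive.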

An example profile from the PULSE@Parkes project
is shown in Figure 2. This pulse profile has been
calibrated using PAC in the PSRCHIVE software suite,
providing both polarisation and flux calibration. An
improved calibration method, described by van Straten
(2004), uses feed cross-coupling data obtained using
the program PCM. The right panel in Figure 2 shows
the pulse profile calibrated using the cross-coupling
data, which agrees with that published by Karastergiou
& Johnston (2006). The differences between the two
profiles in Figure 2 (particularly in Stokes V) highlight
the importance of using careful calibration for
observations obtained using the 20 cm multibeam receiver. An
example of recent search mode PULSE@Parkes data
is shown in Figure 3 where six adjacent individual
pulses from the intermittent pulsar PSR J1717-4054
are plotted. Many of the observations are affected
by radio-frequency interference, but tools are available
within the PSRCHIVE software suite to remove much
of this interference.
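
Once a profile has been calibrated into Stokes parameters, the linearly polarised intensity and position angle plotted in figures such as Figure 2 follow from Stokes Q and U through the standard definitions L = sqrt(Q² + U²) and PA = ½ atan2(U, Q). The sketch below applies these per phase bin; the function names are hypothetical, and no correction is made for the positive noise bias in L, which a careful analysis would need.

```python
import math


def linear_polarisation(q, u):
    """Linearly polarised intensity L = sqrt(Q^2 + U^2)."""
    return math.hypot(q, u)


def position_angle_deg(q, u):
    """Polarisation position angle in degrees, in (-90, 90]."""
    return 0.5 * math.degrees(math.atan2(u, q))


def polarisation_profile(stokes_q, stokes_u):
    """Apply both definitions to every phase bin of a profile."""
    return [(linear_polarisation(q, u), position_angle_deg(q, u))
            for q, u in zip(stokes_q, stokes_u)]


if __name__ == "__main__":
    # Three made-up phase bins, purely to exercise the functions.
    for lin, pa in polarisation_profile([1.0, 0.0, 3.0], [0.0, 1.0, 4.0]):
        print(f"L = {lin:.2f}, PA = {pa:.1f} deg")
```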



3.2.3 The Parkes Pulsar Timing Array (P456) 

The Parkes Pulsar Timing Array (PPTA) project has
the main aim of detecting gravitational wave signals
(described in Verbiest et al. 2010, Hobbs et al. 2009
and references therein). The main data collection for
the project started in 2004 and is ongoing. Observations
are taken every ~3 weeks for 20 pulsars at
three observing frequencies. Several backend instruments
are run in parallel. This project makes extensive
use of the 20 cm multibeam receiver and the dual-band
10/50 cm receiver. Data have been recorded using
an auto-correlation spectrometer (commonly referred
to as the "wide-bandwidth correlator" and labelled as
"WBCORR"), coherent dedispersion systems (CPSR2
and APSR) and the digital filterbanks (PDFB1, PDFB2,
PDFB3 and PDFB4). Observations at the same time
and frequency for different backends contain the same
information and cannot be used as two independent
observations of the pulsar. Data are recorded with
a large number of frequency channels and typically
one-minute integrations. Polarisation information is
available which can be calibrated to produce Stokes
parameters. Files have the same naming convention
as in the P595 data, with CPSR2 data at different
frequencies denoted by an "m" or "n" at the start of the
filename.

Figure 2: Profile for PSR J1359-6038 obtained by Kelso High School students as part of the
PULSE@Parkes project. The profile in the left-hand panel has been calibrated using the standard
PAC calibration method. The profile in the right-hand panel has been calibrated with compensation
for cross-coupling in the 20 cm feed. The outer solid line represents Stokes I, the inner solid line the linear
polarisation (with the position angle shown in the upper panel) and the dotted line shows Stokes V.

Figure 3: Single pulses from the intermittent pulsar
PSR J1717-4054 obtained by students of the
German International School Sydney as part of the
PULSE@Parkes project.

The PDFB1/2/3/4 and WBCORR systems directly
produce PSRFITS data and we make no changes to
the data files for inclusion into the archive. CPSR2
produces individual files for each integration for each
observation. We have combined these integrations into
one PSRFITS file for each observation. We have obtained
the relevant metadata for the observation using
(in most cases) the header information stored in
simultaneous PDFB or WBCORR files.

Individual data files may be large. Typical recent
one-hour observations of PSR J1022+1001 occupy
1.1 GB. The total amount of data provided as part of
the archive is 5 TB and this is expected to grow by
~1 TB/year. The period and dispersion measure of
the pulsars observed as part of the project are given in
Table 6 along with the total number of observations.
In Figure 4 we show typical total intensity pulse
profiles in the 20 cm observing band for each pulsar.

The data for this project can be used for numerous
applications such as studying the polarisation
properties of the pulsars (Yan et al. 2011), pulse shape
variability or dispersion measure variations (You et al.
2007). However, getting the most from the data
requires local knowledge of how the data were taken,
issues with the backend systems during the observing,
the local RFI environment, high quality standard
templates etc. This information is not provided as part of
the data archive and we recommend that any users
of these data sets obtain further information from the
relevant PPTA papers (Verbiest et al. 2010, Hobbs et
al. 2009 and references therein).

Figure 4: Typical 20cm profiles from the PDFB4 backend for the Parkes Pulsar Timing Array pulsars
obtained after a 1-hour observation.

Table 6: Pulsars observed as part of the Parkes
Pulsar Timing Array (P456) observing project.

PSR J       Period (ms)  DM (cm^-3 pc)  N_f
----------  -----------  -------------  ----
J0437-4715  5.757        2.65           4871
J0613-0200  3.062        38.78          1372
J0711-6830  5.491        18.41          1313
J1022+1001  16.453       10.25          1539
J1024-0719  5.162        6.49           1207
J1045-4509  7.474        58.15          1131
J1600-3053  3.598        52.19          1372
J1603-7202  14.842       38.05          974
J1643-1224  4.622        62.41          835
J1713+0747  4.570        15.99          1054
J1730-2304  8.123        9.61           752
J1732-5049  5.313        56.84          597
J1744-1134  4.075        3.14           1144
J1824-2452  3.054        119.86         547
J1857+0943  5.362        13.31          668
J1909-3744  2.947        10.39          2167
J1939+2134  1.558        71.04          677
J2124-3358  4.931        4.62           1156
J2129-5721  3.726        31.85          841
J2145-0750  16.052       9.00           1115



3.3 Surveys 

3.3.1 The 70cm pulsar survey (P050) 

The 70 cm Southern-sky pulsar survey (Manchester et
al. 1996, Lyne et al. 1998) led to the detection of 298
pulsars, of which 101 were new discoveries. These
discoveries included PSR J0437-4715, the brightest
millisecond pulsar known. Each observation lasted 160 s
and 1-bit data were recorded with a sample interval of
300 μs. These survey observations were stored on ~600
Exabyte tapes. Some of these tapes are now unreadable,
but, in total, we succeeded in recovering 42750
observations (93% of the total survey). Each observation
file is 18 MB in size giving a total data storage
of 935 GB. In addition to the survey observations, the
tape files included 4263 re-pointings toward 293 different
pulsars. For each observation we have produced
a single PSRFITS file. We have included various
parameters including the project code (P050), the label
for the front-end receiver (70CM) and source name
(either the pulsar name, or the pointing identifier) in the
PSRFITS file.
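
The quoted file size can be checked against the recording parameters. The sketch below estimates the raw sample payload of one survey pointing from the 160 s duration, the 300 μs sample interval, 1-bit samples and the 256 channels of the AFB_32_256 backend. The function name is hypothetical and header or metadata overhead is not modelled, so the estimate is expected to fall somewhat short of the size on disk.

```python
def raw_payload_bytes(duration_s, tsamp_s, n_chan, bits_per_sample=1):
    """Size in bytes of the sample data alone for one observation."""
    n_samples = int(duration_s / tsamp_s)
    return n_samples * n_chan * bits_per_sample // 8


if __name__ == "__main__":
    size = raw_payload_bytes(duration_s=160.0, tsamp_s=300e-6, n_chan=256)
    print(f"raw payload per pointing: {size / 1e6:.2f} MB")
```

The result, about 17 MB, is close to the 18 MB per file quoted above.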

In order to confirm that we have successfully
converted the files to the PSRFITS format we have
compared the results for a selection of observations
bit-by-bit with the results obtained using the program SC_td,
which was used during the original processing of the
data. No discrepancies were found. We have reprocessed
all data using the search algorithm being used
for the current Parkes HTRU pulsar survey (Keith et
al. 2010). All previously detected pulsars have been
re-detected using the data stored in the archive.

We note that all of the search mode data sets are in
their original form and therefore contain imperfections,
such as radio frequency interference. For instance, we
show in Figure 5 approximately 40 seconds of data for
a typical observation. The grey-scale image provides
the intensity as a function of time and frequency. It
is clear that radio frequency interference is affecting
the highest frequency channels (around a frequency of
450 MHz). Such interference needs to be identified and
removed before standard search algorithms are applied
to the data.

Figure 5: Approximately 40 seconds of data from
the 70 cm Parkes pulsar survey. The high frequency
channels in these data are affected by unexplained
interference.

3.3.2 The Swinburne Intermediate latitude 
survey (P309) 

These data are from a large survey for pulsars at intermediate
Galactic latitudes (Edwards, Bailes, van Straten &
Britton 2001). The survey covered ~4150 square degrees
in the region -100° < l < 50° and 5° < |b| < 15°
with 4702 pointings of the 13-beam receiver (providing
61126 individual files), each of 265 s. In total,
170 pulsars were detected, of which 69 were new discoveries.
The raw data for this project are stored on
Digital Linear Tape (DLT) at Swinburne University of
Technology. We were provided with data files for each
observation that had been processed using the SC_td
software package. We converted each beam of each
pointing to a single PSRFITS file and compared the
converted files with the original files to ensure that the
raw data were unchanged during the conversion process.
The PSRFITS header parameters were updated with 
the project code (P309), the telescope (PARKES), the 
receiver (MULTI) and the beam corresponding to the 
observation. 
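
The sky coverage quoted above can be expressed as a simple membership test, for example to check whether a position of interest lies inside the nominal survey region before searching the archive. The following is a minimal sketch and not part of the archive software: the function name and the convention of wrapping Galactic longitude into the range (-180°, 180°] are assumptions made for illustration, and the actual pointings need not fill the region uniformly.

```python
def in_swinburne_region(l_deg, b_deg):
    """Return True if Galactic coordinates (l, b), in degrees, lie inside
    the nominal survey region -100 < l < 50 and 5 < |b| < 15."""
    # Wrap longitude so that, e.g., l = 300 deg is treated as l = -60 deg.
    l_wrapped = ((l_deg + 180.0) % 360.0) - 180.0
    if l_wrapped == -180.0:
        l_wrapped = 180.0
    return -100.0 < l_wrapped < 50.0 and 5.0 < abs(b_deg) < 15.0


if __name__ == "__main__":
    for l, b in [(300.0, 10.0), (60.0, 10.0), (10.0, -10.0), (10.0, 2.0)]:
        print(l, b, in_swinburne_region(l, b))
```

Longitudes given in the usual 0° to 360° convention are accepted as well as negative values, so that both forms used in the literature give the same answer.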

This programme has 70792 observations stored in 
the archive. These include most of the original sur- 
vey observations and re-pointings toward detected pul- 
sars. For survey observations the source name is set 
to "Unknown" and the pointing identification is set to 
a specific value unique to that particular observation. 
In Figure 6 we plot the position of each observation
that has been recovered overlaid on the positions of all
known pulsars.

Figure 6: Galactic coordinates for the Swinburne
Intermediate Latitude Survey are indicated as bold
points. The area of the sky under the solid line is
where the Parkes 70 cm survey was conducted. The small
dots are the positions of known pulsars.

4 Obtaining the data 
4.1 Data access portals 

The Parkes pulsar data archive can be accessed through
various portals. The Australian National Data Service
(ANDS) portal, called Research Data Australia
(RDA)⁷, is used to search descriptions of data collections.
CSIRO provides a data access portal⁸ intended
for use by professional astronomers to search
for, and download, small numbers of data files. The
PULSE@Parkes portal⁹ makes the data accessible to
the broader community. Virtual Observatory tools can
also be used to query the database.

4.1.1 Research Data Australia portal 

The Australian National Data Service (ANDS) intends
to present information about, and access to, as much
Australian research data as possible in a common manner.
This portal can be used to obtain information
about various pulsar projects and data collections.
For instance, a user can search for "astronomical
data" and then obtain information on, e.g.,
the P456 Parkes Pulsar Timing Array project. Note
that this portal will not allow queries based on observational
parameters such as the source name or position.
The emphasis of Research Data Australia (RDA)
is on discovering the existence of collections of data,
with discipline-specific queries being handled by specific
portals such as those described below. An example
is shown in Figure 7 where information is provided
on the P456 project. Note that the CSIRO Data Access
Portal (described in §4.1.2) provides links to the
relevant parts of the Research Data Australia website.

⁷ http://www.ands.org.au; http://services.ands.org.au/home/orca/rda/

⁸ http://datanet.csiro.au/dap/

⁹ http://outreach.atnf.csiro.au/education/pulseatparkes/

Figure 7: Example screenshot from the ANDS portal
that provides access to information about individual
projects.

4.1.2 The CSIRO Data Access Portal 

The CSIRO data access portal provides an interface 
to data sets including the Parkes pulsar observations. 
This system allows searching on pulsar name, project 
identification or areas of the sky. An example screen- 
shot is shown in Figure 8. This portal provides a means 
to download a small number of individual files from 
the archive. Typical usage would be to search for a 
particular pulsar name (e.g., "J0437-4715"). At the 
time of writing, this returns 10008 files stored in the 
database. These are divided into the original data 
files (5112 files) and pre-processed files (4896 files). 
A panel is presented providing a basic description of 
these files (e.g., 1072 observations were obtained using
the PDFB1 system and 40 of these observations
were obtained as part of the PULSE@Parkes project). 
The user can then filter these results to obtain, for 
instance, only PULSE@Parkes observations, obtained 
with the PDFB4 backend instrumentation. This re- 
duces the number of files to 10 which can be selected 
for download. 

Most of the fold-mode observations have correspond- 
ing pre-processed files that have been summed in po- 
larisation, frequency and time. These pre-processed 
files are significantly smaller than the raw observations 
and can be used for many purposes. However, it will
not be possible to undertake high-precision pulsar
timing, frequency-dependent investigations or analysis
of the pulse polarisation using such data. Thumbnail
images of these pre-processed files are available.
These should be viewed before a file is selected for
download to ensure that the data quality is sufficient
for the project being undertaken.

If required, calibration files can also be downloaded. 
As calibration files may have been obtained before or 
after the pulsar observation, the CSIRO data access 
portal provides the ability to download all calibration 
files within a specified time range before or after the 
start of the pulsar observation. 

With a few exceptions, observations from the Parkes 
radio telescope are embargoed for a period of 18 months 
from the time that the data were obtained. The CSIRO 
access portal is the only generally accessible means by 
which files can currently be downloaded and therefore 
requires the user to provide a user name and password 
if embargoed data are required. An individual who is 
part of an observing project can log on to the portal 
using the account that they used to submit or view 
their observing proposal. 
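
As an illustration of the embargo rule described above, the following minimal sketch estimates whether an observation is likely to be publicly available on a given date. It assumes that the 18-month period is counted in calendar months from the observation date; the function names are hypothetical, and the exceptions mentioned in the text are not modelled.

```python
import calendar
from datetime import date

EMBARGO_MONTHS = 18  # embargo period quoted in the text


def add_months(d, months):
    """Return the date `months` calendar months after `d`, clamping the
    day of the month to the last valid day of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_public(obs_date, today):
    """True once the assumed embargo on an observation has expired."""
    return today >= add_months(obs_date, EMBARGO_MONTHS)


if __name__ == "__main__":
    print(add_months(date(2009, 1, 15), EMBARGO_MONTHS))
    print(is_public(date(2010, 1, 1), date(2011, 6, 30)))
```

The portal itself remains the authority on which files require a login; a check of this kind is only useful for planning.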

4.1.3 The PULSE@Parkes portal 

Simplified versions of the PULSE@Parkes data sets are
also available from the project website. This website
provides images of each observation and the data in a
simple text form that can be loaded into a spreadsheet.
A simple web interface allows the data to be processed 
online to determine the pulsar dispersion measures and 
characteristic ages. New online educational modules 
using these data sets will become available in the fu- 
ture. 
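
For readers who wish to reproduce such quantities offline, the sketch below uses the standard textbook relations for the cold-plasma dispersion delay and for the characteristic age, τ = P/(2Ṗ). It is not a description of the algorithm used by the PULSE@Parkes web interface; the function names are hypothetical and the dispersion constant is the commonly quoted value of 4.148808 × 10³ MHz² pc⁻¹ cm³ s.

```python
K_DM = 4.148808e3            # dispersion constant, MHz^2 pc^-1 cm^3 s
SECONDS_PER_YEAR = 365.25 * 86400.0


def dispersion_delay_s(dm, f_lo_mhz, f_hi_mhz):
    """Delay (s) of the pulse at f_lo relative to f_hi for a given DM
    (pc cm^-3), using the standard cold-plasma dispersion law."""
    return K_DM * dm * (f_lo_mhz ** -2 - f_hi_mhz ** -2)


def dm_from_delay(delay_s, f_lo_mhz, f_hi_mhz):
    """Invert the dispersion law: DM (pc cm^-3) from a measured delay."""
    return delay_s / (K_DM * (f_lo_mhz ** -2 - f_hi_mhz ** -2))


def characteristic_age_yr(period_s, period_derivative):
    """Characteristic age tau = P / (2 Pdot), returned in years."""
    return period_s / (2.0 * period_derivative) / SECONDS_PER_YEAR


if __name__ == "__main__":
    delay = dispersion_delay_s(100.0, 1200.0, 1500.0)
    print(delay, dm_from_delay(delay, 1200.0, 1500.0))
    # Illustrative values roughly matching a young pulsar.
    print(characteristic_age_yr(0.0333, 4.21e-13))
```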

4.1.4 The Virtual Observatory Interface 

The Virtual Observatory (VO) allows a user to com- 
bine and compare a large number of different data sets. 
A diverse range of astronomical catalogues and images 
are already available through the VO including pulsar 
catalogues and the tables of pulsar parameters that 
have been included in recent publications. The Inter- 
national Virtual Observatory Alliance (IVOA) defines 
standards and protocols that enable astronomers to 
compare and cross-correlate these data sets in a consistent
manner. A number of VO compatible tools already
exist to find, query and manipulate such data. Tools
also exist to process VO data via scripting languages 
(e.g., voclient). 
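
To make the idea of a positional cross-match concrete, the following minimal sketch implements a cone search, and a "multi cone search" over several targets, on a list of observation records held in memory. It does not use any VO library or the archive's own interface; the record layout (dictionaries with "ra" and "dec" keys in degrees) and the function names are assumptions made for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def cone_search(rows, ra0, dec0, radius_deg):
    """Return the rows lying within radius_deg of (ra0, dec0)."""
    return [r for r in rows
            if angular_separation_deg(ra0, dec0, r["ra"], r["dec"]) <= radius_deg]


def multi_cone_search(rows, targets, radius_deg):
    """Run one cone search per target; targets maps name -> (ra, dec)."""
    return {name: cone_search(rows, ra, dec, radius_deg)
            for name, (ra, dec) in targets.items()}


if __name__ == "__main__":
    # Hypothetical observation records and an approximate target position.
    observations = [
        {"file": "a", "ra": 69.30, "dec": -47.20},
        {"file": "b", "ra": 120.00, "dec": 10.00},
    ]
    targets = {"J0437-4715": (69.316, -47.253)}
    print(multi_cone_search(observations, targets, 0.5))
```

For catalogues of realistic size, dedicated tools use spatial indexing rather than the exhaustive comparison shown here.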

It is possible to query the metadata that provides
information about each pulsar observation using VO
tools. Both cone-searches (allowing searches in position)
and queries in the Astronomical Data Query
Language (ADQL) are implemented. An example use-case
would be to obtain a listing (in HTML, CSV or
the more flexible VOTable format) of all files in the
archive that were obtained in survey mode¹⁰. The resulting
VOTable can be loaded into virtual observatory
packages (such as TOPCAT; Taylor 2005). Figure 9
shows a TOPCAT display of the coordinates for all the
observations in the 70 cm pulsar survey. A "multi cone
search" can then be run to match these search mode
observations with, e.g., known pulsar positions from
the ATNF pulsar catalogue (Manchester et al. 2005),
or the AGILE catalogue of gamma-ray sources
(Pittori et al. 2009)¹¹. One obvious possibility would
be to select all pulsars with a specific property of interest
from the ATNF pulsar catalogue (such as pulsars
with high magnetic field strengths) and then use the
virtual observatory tools to identify observations available
for download that may help to study this class of
pulsar.

¹⁰ ADQL is based upon a subset of SQL92 with extensions
for astronomical usage.

¹¹ Such a search can be carried out in TOPCAT by loading
the resulting VOTable from the ADQL query and then
carrying out a multiple cone search with any of the catalogues
that are currently in VO format.

Figure 8: Example screenshot from the CSIRO data access (pulsar) portal. The top panel allows the user
to select sky-positions, a pulsar name, project identifier or date range to restrict the search results. The
panel on the left divides the search results into various subsections. The bottom panel shows the result
from a search and the thumbnail image gives an indication of the data quality.

Figure 9: Example screenshot from using the virtual
observatory package TOPCAT. This shows
the positions (on the celestial sphere) of all observations
for the 70 cm pulsar survey.

4.1.5 Large data sets

The current data archive stores ~5 TB of data. The
amount of data stored will increase rapidly as the data
from more observing programmes are added. It is
clearly not possible to download a significant part of
this archive using the online portals (currently a restriction
of 50 files is placed on any individual download).
We are planning new approaches to allow access
to such large data files using high performance
computing infrastructure, but this has not yet been
implemented. Instead, for folded data sets the user
may wish to obtain pre-processed files, which will avoid
long download times. The CSIRO data access portal
provides the option to download the original or the
pre-processed files.
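
The restriction of 50 files per download noted in Section 4.1.5 means that a longer list of files has to be requested in groups. The following minimal sketch splits a list of file names accordingly; the function name and the file names are hypothetical, and no portal interface is called.

```python
def batches(file_names, max_per_download=50):
    """Yield successive lists of at most max_per_download file names."""
    if max_per_download < 1:
        raise ValueError("max_per_download must be at least 1")
    for start in range(0, len(file_names), max_per_download):
        yield file_names[start:start + max_per_download]


if __name__ == "__main__":
    names = ["obs%05d" % i for i in range(120)]  # hypothetical file names
    for group in batches(names):
        print(len(group), group[0], group[-1])
```

By the same arithmetic, the 10008 files returned for PSR J0437-4715 in Section 4.1.2 would require 201 separate downloads, which illustrates why the pre-processed files are recommended for large requests.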



5 Using the data 

As each data file is stored in PSRFITS format, much
of the standard software for processing FITS files can
be used. For instance, the archiving software itself
uses the nom.tam Java library for reading the files¹².
The NASA High Energy Astrophysics Science Archive
Research Centre¹³ provides many other tools that can
be used. Available utility programs that work with
PSRFITS include:

• LISTHEAD - provides a listing of header
parameters within the file.

• FITSCOPY - provides routines to copy FITS files
(note that most options are not relevant for pulsar
data).

• LISTSTRUC - lists the formatting internal to the
FITS file (provides details on which parameters
are stored as strings, integers, floating point,
etc.).

• MODHEAD - displays or modifies a header keyword.
For instance, this can be used to change
the pulsar's name that is stored in the file. For
fold-mode files, the PSRCHIVE tool psredit
can also be used for this purpose.

• TABLIST - displays the contents of a FITS table.
This utility can be used to display tabular
information from the FITS file; for instance, to
determine the parallactic angle for each integration.

• TABCALC - allows simple calculations to be performed
on tables within the FITS file. Columns
may be overwritten or new columns created. A
new FITS file is created.

• FV - provides a graphical interface allowing the
various header parameters and tables to be inspected
by eye (and, if required, modified). FV
also provides simple plotting routines. This is
part of the much larger FTOOLS package which
can be downloaded in its entirety.

¹² http://heasarc.gsfc.nasa.gov/docs/heasarc/fits/java/v0.9/javadoc/

¹³ http://heasarc.gsfc.nasa.gov/docs/heasarc/fits.html

In general, only tools that work with general FITS data
files are compatible with PSRFITS. Utility programs
that work with FITS images, e.g., SAOIMAGE, DS9, 
IMLIST, will not be compatible. 
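
Because PSRFITS files follow the FITS standard, the primary header can also be inspected without any astronomy-specific software. The following minimal sketch relies only on the standard FITS layout (80-character header cards grouped into 2880-byte blocks and terminated by an END card) and returns the keyword values as strings. It is an illustration rather than a replacement for the utilities listed above; the function names are hypothetical, and keyword names such as PROJID, FRONTEND and SRC_NAME reflect our understanding of the PSRFITS definition and should be checked against a real file.

```python
CARD = 80      # length of one FITS header card, in bytes
BLOCK = 2880   # FITS files are organised in 2880-byte blocks


def _parse_value(field):
    """Extract the value from the part of a card after '= '."""
    field = field.strip()
    if field.startswith("'"):
        # String value: read up to the closing quote; '' is a literal quote.
        out, i = [], 1
        while i < len(field):
            if field[i] == "'":
                if field[i + 1:i + 2] == "'":
                    out.append("'")
                    i += 2
                    continue
                break
            out.append(field[i])
            i += 1
        return "".join(out).rstrip()
    # Non-string value: drop any trailing '/ comment'.
    return field.split("/", 1)[0].strip()


def parse_header(data):
    """Parse header bytes into a dict; stops at the END card."""
    header = {}
    for i in range(0, len(data) - CARD + 1, CARD):
        card = data[i:i + CARD].decode("ascii", errors="replace")
        key = card[:8].strip()
        if key == "END":
            return header
        if card[8:10] == "= ":          # skips COMMENT/HISTORY/blank cards
            header[key] = _parse_value(card[10:])
    raise ValueError("no END card found")


def read_primary_header(path):
    """Read blocks from a FITS file until the END card, then parse them."""
    chunks = []
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK)
            if len(block) < BLOCK:
                raise ValueError("END card not found before end of file")
            chunks.append(block)
            cards = (block[i:i + CARD] for i in range(0, BLOCK, CARD))
            if any(c[:8].rstrip() == b"END" for c in cards):
                return parse_header(b"".join(chunks))
```

A call such as `read_primary_header("observation.sf")["PROJID"]` (with a hypothetical file name) would then be expected to return the project code stored during the conversion described in Section 3.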

All fold-mode files can be processed using the PSRCHIVE
software suite. A common sequence of processing
steps would be to 1) download the data file using
the CSIRO data access portal, 2) use paz and/or pazi
to remove RFI, 3) use pac to calibrate the profile, 4) use pam
to produce a single pulse profile integrated in observing
frequency and over all integrations, 5) use pav to view the
pulse profile and 6) use pat to obtain pulse times-of-arrival
which can be processed using tempo2. 
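
The processing sequence above can be sketched as a list of command lines. The example below only builds and prints the commands (a dry run) and does not execute them. The options shown are commonly used ones, but they, the file names and the intermediate file extensions (in particular the assumption that pac writes its output with a ".calib" extension) should be checked against each program's help output for the installed PSRCHIVE version; the function name is hypothetical.

```python
import shlex
from pathlib import Path


def psrchive_commands(raw_file, cal_database, template):
    """Return the argument lists for steps 2-6 of the sequence in the text."""
    raw = Path(raw_file)
    zapped = raw.with_suffix(".zap")        # output of paz -e zap
    calibrated = raw.with_suffix(".calib")  # assumed default pac output name
    scrunched = raw.with_suffix(".FT")      # output of pam -e FT
    return [
        # 2) automatic RFI excision (pazi is the interactive alternative)
        ["paz", "-r", "-e", "zap", str(raw)],
        # 3) calibration using a database of calibration files
        ["pac", "-d", str(cal_database), str(zapped)],
        # 4) sum over frequency (-F) and time (-T)
        ["pam", "-F", "-T", "-e", "FT", str(calibrated)],
        # 5) plot the total-intensity profile
        ["pav", "-DFTp", str(scrunched)],
        # 6) times-of-arrival in tempo2 format against a standard template
        ["pat", "-s", str(template), "-f", "tempo2", str(scrunched)],
    ]


if __name__ == "__main__":
    for cmd in psrchive_commands("obs.rf", "database.txt", "template.std"):
        print(shlex.join(cmd))
```

Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` once the options have been verified.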

Search mode files can be processed using the DSPSR
(van Straten & Bailes 2010) or SIGPROC¹⁴ software
packages. SIGPROC provides various tools for plotting
the data or for searching for new pulsars. DSPSR allows
the raw data to be displayed (using searchplot) or to
be folded with a given period to form a folded profile
(using dspsr).

5.1 Ancillary files 

The data archive provides access only to the observation
data files. In order to process these files it may
be necessary to obtain extra data files relevant to the
Parkes observatory. For instance, the pulsar timing
method requires that the clock used at the observatory
to measure the pulse arrival times be converted
to a realisation of terrestrial time. This conversion is
provided in a set of "clock correction files" that can be
obtained as part of the tempo2 distribution or from
the pulsar web site¹⁵. Other useful files, such as measurements
of the time delays between different backend
instrumentation, may also be obtained from this website.

¹⁴ http://sigproc.sourceforge.net/. Note that only
the most recent version of SIGPROC is compatible with PSRFITS.
It is expected that the next version of the presto
search-mode package will also be compatible with our data
files.

5.2 Referencing the database 

Much of the data available from the archive is from 
on-going projects. Even though all data older than 18
months are out of any embargo period, we recommend
that the people who carried out the observations be
contacted before extensive use is made of the data, as
each data set has its own peculiarities that may need
to be understood.

Any publication containing these data sets should 
refer to the original paper describing the data sets. We 
would also appreciate a reference to the portal used 
to download the data and/or a reference to this paper.
It is a requirement of the Australia Telescope National
Facility that any publication making use of the
Parkes data includes a specific acknowledgement that
is listed on the CSIRO Astronomy and Space Science
webpage¹⁶.

6 The future 

The initial data archive provides observations obtained 
from five observing programmes. More than 300 dif- 
ferent observing programmes relating to pulsars have 
been undertaken at the Parkes observatory and pulsar 
observations currently take up two-thirds of the to- 
tal time on the telescope. Work is on-going to ensure 
that all future observations are included in the archive. 
Owing to the volume of data it is unlikely that, in the 
near future, we will provide the data from an on-going 
Parkes pulsar survey (Keith et al. 2010). When completed,
this survey will require more than 1 PB of data
storage. We are currently attempting to identify the 
means by which such large data sets could be stored, 
accessed and processed. 

After the software has been developed to include 
current observations in the archive, we will recover as 
many existing data sets as possible. The choice of 
which new observations to add into the archive de- 
pends upon data storage requirements and the acces- 
sibility of the data. It is likely that the next major 
data sets to be added will be 1) the Parkes multibeam 
survey, which discovered about half of all the known 
pulsars (Manchester et al. 2001), 2) the timing ob- 
servations relating to new discoveries from this survey 
(Lorimer et al. 2006, Faulkner et al. 2004, Hobbs 
et al. 2004, Kramer et al. 2003, Morris et al. 2002, 
Manchester et al. 2001) and 3) the timing observations 
being carried out as part of the Fermi gamma-ray mis- 
sion (Weltevrede et al. 2010). A list of the data sets 
currently available is on our website¹⁷.

In the near future, it is likely that observations
from a 12-m antenna commissioned in 2008 at the
Parkes Observatory as a test-bed for new technology
receivers for the Australian Square Kilometre Array
Pathfinder (ASKAP) will be included as part of the
archive. In the longer term it is possible that our data
archive will merge with the Australia Telescope Online
Archive¹⁸ and provide observations from Parkes,
the Australia Telescope Compact Array and the Mopra
telescopes.

¹⁵ http://www.atnf.csiro.au/research/pulsar

¹⁶ http://www.atnf.csiro.au/research/publications

¹⁷ http://www.atnf.csiro.au/research/pulsar/index.php?n=Main.ANDSATNF

www.publish.csiro.au/journals/pasa

7 Conclusions 

Observations at the Parkes radio telescope have led to
numerous discoveries relating to pulsar astrophysics.
The data archive described here allows, for the first
time, access to many of the original observations that
were used in making these discoveries. It is hoped
that this new resource will be used for numerous scientific
projects, including long-term pulsar timing
experiments and the discovery of new pulsars in existing
data sets, and that it will provide an archive of high
time-resolution data allowing new and unexpected discoveries.